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Summary. This paper reviews the most popular methods which are used in lattice 
QCD to compute the determinant of the lattice Dirac operator: Gaussian integral 
representation and noisy methods. Both of them lead naturally to matrix function 
problems. We review the most recent development in Krylov subspace evaluation of 
matrix functions. The second part of the paper reviews the formal relationship and 
algebraic structure of domain wall and overlap fermions. We review the multigrid 
algorithm to invert the overlap operator. It is described here as a preconditioned 
Jacobi iteration where the preconditioner is the Schur complement of a certain block 
of the truncated overlap matrix. 



1 Lattice QCD 

Quantum Chromodynamics (QCD) is the quantum theory of interacting quarks and 
giuons. It should explain the physics of strong force from low to high energies. Due to 
asymptotic freedom of quarks at high energies, it is possible to carry out perturbative 
calculations in QCD and thus succeeding in explaining a range of phenomena. At low 
energies quarks are confined within hadrons and the coupling between them is strong. 
This requires non-perturbative calculations. The direct approach it is known to be 
the lattice approach. The lattice regularization of gauge theories was proposed by 
[Wilson 1974] . It defines the theory in an Euclidean 4-dimensional finite and regular 
lattice with periodic boundary conditions. Such a theory is known to be Lattice 
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QCD (LQCD). The main task of LQCD is to compute the hadron spectrum and 
compare it with experiment. But from the beginning it was realized that numerical 
computation of the LQCD path integral is a daunting task. Hence understanding 
the nuclear force has ever since become a large-scale computational project. 

In introducing the theory we will limit ourselves to the smallest set of definitions 
that should allow a quick jump into the computational tasks of LQCD. 

A fermion field on a regular Euclidean lattice A is a Grassmann valued function 
ipn,c{z) G G, x — {a; M ,/i = 1,...,4} G A which carries spin and colour indices 
(j, = 1, . . . , 4, c = 1, 2, 3. Grassmann fields are anticommuting fields: 

i>,M, c { x )i i ^,b{y) + ip v ,b{y)^^.A x ) = o (!) 

for tpfi ; c(x),tp v ,b{y) G G and /n, v — 1,...,4, b, c = 1,2,3. In the following we 
will denote by ip(x) G G12 the vector field with 12 components corresponding to 
Grassmann fields of different spin and colour index. 

The first and second order differences are defined by the following expressions: 

d^(x) = j^[i>{x + ae M ) - ip(x - ae M )] 

dfa{x) = ^[tp(x + ae M ) + ip(x - ae M ) - 2ip(x)] 
where a and e M are the lattice spacing and the unit lattice vector along the coordinate 
/' 1 I- 

Let U{x)n G C 3x3 be an unimodular unitary matrix, an element of the 5(7(3) 
Lie group in its fundamental representation. It is a map onto 5*?7(3) colour group of 
the oriented link connecting lattice sites x and x + ae M . Physically it represents the 
gluonic field which mediates the quark interactions represented by the Grassmann 
fields. A typical quark field interaction on the lattice is given by the bilinear form: 

^){x)U{x)^{x + ae„) (3) 

where tp{x) is a second Grassmann field associated to a; € A Lattice covariant 
differences are defined by: 

Vpip(x) = j^[U{x) tJ ,tp(x + ae M ) - U H (x - ae^^ip^x - ae M )] 

(4) 

A ti ip(x) = -ij[U(x)^(x + ae M ) + U H (x - ae^^ix - ae M ) - 2ip(x)] 

where by U H (x) is denoted the Hermitian conjugation of the gauge field U(x), which 
acts on the colour components of the Grassmann fields. The Wilson-Dirac operator 
is a matrix operator Dw{m q ,U) G C NxN . It can be defined through 12 x 12 block 
matrices [Dw(m q , U)](x, y) G C 12x12 such that: 

4 

[D w {m qi U)r]{x)=m q r{x) + Y j [^V^ q (x) - ^A^ q {x)] (5) 

where m q is the bare quark mass with the index q = 1, . . . , Nf denoting the quark 
flavour; ip q (x) denotes the Grassmann field corresponding to the quark flavour with 
mass m q \ {7^ G C 4x4 ,^i = 1,...,5} is the set of anti-commuting and Hermitian 
gamma-matrices of the Dirac-Clifford algebra acting on the spin components of the 
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Grassmann fields; N = I2L1L2L3L4 is the total number of fermion fields on a lattice 
with Li, L2, -L3, L4 sites in each dimension. Dw(jn q , U) is a non-Hermitian operator. 
The Hermitian Wilson-Dirac operator is defined to be: 

H w (m q ,U) = j 5 Dw(rn q ,U) (6) 

where the product by 75 should be understood as a product acting on the spin 
subspace. 

The fermion lattice action describing Nf quark flavours is defined by: 

N f 

S f (U,il> u ...,il> rf/ ,li u ... 1 $ N/ ) = Y i i> q (x){D w (m q ,U)](x,y)i> q (y) (7) 

9 = 1 x,yeA 

The gauge action which describes the dynamics of the gluon field and its interaction 
to itself is given by: 

S B (U) = ±'£Tr(l-U f ) (8) 

9 P 

where P denotes the oriented elementary square on the lattice or the plaquette. The 
sum in the right hand side is over all plaquettes with both orientations and the trace 
is over the colour subspace. Up is a SU (3) matrix defined on the plaquette P and g 
is the bare coupling constant of the theory. 

The basic computational task in lattice QCD is the evaluation of the path inte- 
gral: 

Nj 

Zqcd = f vh(U) JJ a ^ ; ^) e -Sf(u^ 1 ,...^ Nf ^ 1 ,...^ Nf )-s g (u) (g) 

where uh(U) and a(ip q ,ip q ) denote the Haar and Grassmann measures for the gth 
quark flavour respectively. The Haar measure is a SU(3) group character, whereas 
the Grassmann measure is defined using the rules of the Berezin integration: 

J di>n, c (x) = 0, J dVv,c(aOVv,c(z) = 1 (10) 
Since the fermionic action is a bilinear form on the Grassmann fields one gets: 

N f 

Zqcd = f a H (U)Y[detD w (m q ,U)e~ s ^ u) (11) 
■' 9=1 

Very often we take Nf = 2 for two degenerated 'up' and 'down' light quarks, 
m u — vrid- In general, a path integral has 0(e N ) computational complexity which is 
classified as an NP-hard computing problem [Wasilkowski and Wozniakowski 1996] . 
But stochastic estimations of the path integral can be done by 0(N a ) complexity 
with a > 1. This is indeed the case for the Monte Carlo methods that are used ex- 
tensively in lattice QCD, a topic which is reviewed in this volume by Mike Peardon 
[Peardon 2003]. 

It is clear now that the bottle-neck of any computation in lattice QCD is the 
complexity of the fermion determinant evaluation. A very often made approxima- 
tion is to ignore the determinant altogether. Physically this corresponds to a QCD 
vacuum without quarks, an approximation which gives errors of the order 10% in 
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the computed mass spectrum. This is called the valence or quenched approximation 
which requires modest computing resources compared to the true theory. To answer 
the critical question whether QCD is the theory of quarks and gluons it is thus 
necessary to include the determinant in the path integral evaluation. 

Direct methods to compute the determinant of a large and sparse matrix are 
very expensive and even not adequate for this class of matrices. The complexity of 
LU decomposition is 0(N 3 ) and it is not feasible for matrices with N = 1920000 
which is the case for a lattice with 20 sites across each dimension. Even 0(N 2 ) 
methods are still very expensive. Only O(N) methods are feasible for the present 
computing power for such a large problem. 

2 Gaussian integral representation: pseudofermions 

The determinant of a positive definite matrix, which can be diagonalized has a 
Gaussian integral representation. We assume here that we are dealing with a matrix 
A G C NxN which is Hermitian and positive definite. For example, A = Hw(jn q , 17) 2 . 
It is easy to show that: 



The vector field 4>(x) G C 12 ,x G A that went under the name pseudofermion field 
[Weingarten and Petcher 1981], has the structure of a fermion field but its compo- 
nents are complex numbers (as opposed to Grassmann numbers for a fermion field). 

Pseudofermions have obvious advantages to work with. One can use iterative 
algorithms to invert A which are well suited for large and sparse problems. The 
added complexity of an extended variable space of the integrand can be handled 
easily by Monte Carlo methods. 

However, if A is ill-conditioned then any 0(N a ) Monte Carlo algorithm, which 
is used for path integral evaluations is bound to produce small changes in the gauge 
field. (Of course, an 0(e N ) algorithm would allow changes of any size!) Thus, to 
produce the next statistically independent gauge field one has to perform a large 
number of matrix inversions which grows proportionally with the condition number. 
Unfortunately, this is the case in lattice QCD since the unquenching effects in hadron 
spectrum are expected to come form light quarks which in turn make the Wilson- 
Dirac matrix nearly singular. 

The situation can be improved if one uses fast inversion algorithms. This was 
the hope in the early '90 when state of the art solvers were probed and researched 
for lattice QCD [Borigi and de Forcrand 1994, Frommer et al 1994]. Although rev- 
olutionary for the lattice community of that time, these methods alone could not 
improve significantly the above picture. 

Nonetheless, pseudofermions remain the state of the art representation of the 
fermion determinant. 



3 Noisy methods 

Another approach that was introduced later is the noisy estimation of the fermion 
determinant [Bai et al, 1996, Thron et al 1998, Cahill et al, 1999, Borici 2003a]. It 




(1) 



Computational methods for the fermion determinant 5 
is based on the identity: 

detA = e TrlosA (1) 

and the noisy estimation of the trace of the natural logarithm of A. 

Let Zj G {+1, — 1,...,N be independent and identically distributed 

random variables with probabilities: 

Prob(Z j = l)=Prob(Z i = -l) = i j = l,...,N (2) 
Then for the expectation values we get: 

E(Z j ) = 0, E(ZiZfc)=*j*. j,k = l,...,N (3) 
and the following result holds: 

Proposition 3.1 Let X be a random variable defined by: 

X = Z T log AZ, Z T = {Z 1 ,Z 2 ,...,Z N ) (4) 
Then its expectation fj, and variance a 2 are given by: 

» = E(X)=TrlogA, a 2 =E[(X- M ) 2 ] = 2^[i?e(log^) 3fe ] 2 (5) 

To evaluate the matrix logarithm one can use the methods described in [Bai et al, 1996, 
Thron et al 1998, Cahill et al, 1999, Borigi 2003a]. These methods have similar com- 
plexity with the inversion algorithms and are subject of the next section. 

However, noisy methods give a biased estimation of the determinant. This bias 
can be reduced by reducing the variance of the estimation. A straightforward way 
to do this is to take a sample of estimations Xi, . . . , X p and to take as estimator 
their arithmetic mean. 

[Thron et al 1998] subtract traceless matrices which reduce the error on the de- 
terminant from 559% to 17%. [Golub 2003] proposes a promising control variate 
technique which can be found in this volume. 

Another idea is to suppress or 'freeze large eigenvalues of the fermion deter- 
minant. They are known to be artifacts of a discretized differential operator. This 
formulation reduces by an order of magnitude unphysical fluctuations induced by 
lattice gauge fields [Borigi 2003b]. 

A more radical approach is to remove the bias altogether. The idea is to 
get a noisy estimator of Tr log A by choosing a certain order statistic -XVm G 
{A(i) < X(2) ■ ■ ■ < -^O)} suc h that the determinant estimation is unbiased 
[Borigi 2003c]. More one this subject can be found in this volume from the same 
author [Borigi 2003d]. 



4 Evaluation of bilinear forms of matrix functions 



We describe here a Lanczos method for evaluation of bilinear forms of the type: 

F(b,A)=b T f(A)b (1) 
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where b € R^ is a random vector and f(s) is a real and smooth function of s £ R+. 

The Lanczos method described here is similar to the method of [Bai et al, 1996] . 
Its viability for lattice QCD computations has been demonstrated in the recent work 
of [Cahill et al, 1999]. [Bai et al, 1996] derive their method using quadrature rules 
and Lanczos polynomials. Here, we give an alternative derivation which is based on 
the approach of [Borigi, 1999b, Borigi, 2000a, Borigi, 2000b]. The Lanczos method 
enters the derivation as an algorithm for solving linear systems of the form: 

Ax = b, x£C N (2) 

Lanczos algorithm 

n steps of the Lanczos algorithm [Lanczos, 1952] on the pair (A, b) are given by 
Algorithm 1. 



Algorithm 1 The Lanczos algorithm 

Set fa = 0, q = o, qi =6/||6|| 2 
for i = 1, . . . n do 

v = Aqi 
en = q\v 

v := v - qiCti - qi-ifa-i 

A = N| 2 

= v//3i 
end for 



The Lanczos vectors qi , . . . , q„ G C can be compactly denoted by the matrix 
Qn = [<7i, ■ ■ ■ , <7n]- They are a basis of the Krylov subspace fC„ = span{6, Ab, . . . , A n_1 6}. 
It can be shown that the following identity holds: 



AQ n — QnT n -j- /3n^n + lC n , 

e n is the last column of the identity matrix 1„ 
and symmetric matrix given by: 



(Oil Pi 
Pi 012 

V 



9i=6/IHl2 (3) 
R nx ™ and T n is the tridiagonal 



•• Pn-l 

Pn-i a n 



(4) 



The matrix (4) is often referred to as the Lanczos matrix. Its eigenvalues, the so 
called Ritz values, tend to approximate the extreme eigenvalues of the original matrix 
A as n increases. 

To solve the linear system (2) we seek an approximate solution x n £ IC n as a 
linear combination of the Lanczos vectors: 



X n = QnVn, Vn € C" 

and project the linear system (2) on to the Krylov subspace /C n : 



(5) 
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QlAQ n y n = Qtb = QUi\\b\\2 
Using (3) and the orthonormality of Lanczos vectors, we obtain: 

T n y n = ei||fe|| 2 

where ei is the first column of the identity matrix l n . By substituting y n into (5) 
one obtains the approximate solution: 

x„ = Q n T~ 1 e 1 \\b\\ 2 (6) 

The algorithm of [Thron et al 1998] is based on the Pade approximation of the 
smooth and bounded function /(.) in an interval [Graves-Morris, 1979]. Without 
loss of generality one can assume a diagonal Pade approximation in the interval 
s G (0, 1). It can be expressed as a partial fraction expansion. Therefore, one can 
write: 

k — 1 

with Ck £ R, dfe > 0, k = l,...,m. Since the approximation error 0(s 2m+1 ) can 
be made small enough as m increases, it can be assumed that the right hand side 
converges to the left hand side as the number of partial fractions becomes large 
enough. For the bilinear form we obtain: 

m 
k — 1 

Having the partial fraction coefficients one can use a multi-shift iterative solver 
of [Freund, 1993] to evaluate the right hand side (8). To see how this works, we solve 
the shifted linear system: 

{A + d k t)x k = b 

using the same Krylov subspace JC n - A closer inspection of the Lanczos algorithm, 
Algorithm 1 suggests that in the presence of the shift dk we get: 

oti — a>i + d k 

while the rest of the algorithm remains the same. This is the so called shift-invariance 
of the Lanczos algorithm. From this property and by repeating the same arguments 
which led to (6) we get: 

xt ~ Qn 1 ei||ft|| a (9) 

In + (JfcJln 

A Lanczos algorithm for the bilinear form 

The algorithm is derived using the Pade approximation of the previous paragraph. 
First we assume that the linear system (2) is solved to the desired accuracy using 
the Lanczos algorithm, Algorithm 1 and (6). Using the orthonormality property of 
the Lanczos vectors and (9) one can show that: 

m m 
fc=l fc=l 
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Note however that in presence of roundoff errors the orthogonality of the Lanczos 
vectors is lost but the result (10) is still valid [Cahill et al, 1999, Golub & Strakos 1994] 
For large m the partial fraction sum in the right hand side converges to the matrix 
function f{T n ). Hence we get: 

F(b,A)fii£ n (b,A) = ||&|| 2 er/(T n ) ei (11) 

Note that the evaluation of the right hand side is a much easier task than the 
evaluation of the right hand side of (1). A straightforward method is the spectral 
decomposition of the symmetric and tridiagonal matrix T n : 

(12) 

where Q n € R nxn is a diagonal matrix of eigenvalues wi, . . . ,ui„ of T n and Z n € 
R nxn is the corresponding matrix of eigenvectors, i.e. Z n — [z\, . . . ,z n \. From (11) 
and (12) it is easy to show that (see for example [Golub & Van Loan, 1989]): 

A(b,A) = \\b\\ 2 e T l Z n f{Q n )Zle 1 (13) 

where the function /(.) is now evaluated at individual eigenvalues of the tridiagonal 
matrix T n . 

The eigenvalues and eigenvectors of a symmetric and tridiagonal matrix can be 
computed by the QR method with implicit shifts [Bai et al edts, 2000]. The method 
has an 0(n 3 ) complexity. Fortunately, one can compute (13) with only an 0(n 2 ) 
complexity. Closer inspection of eq. (13) shows that besides the eigenvalues, only 
the first elements of the eigenvectors are needed: 

n 

Mb, A) = \\b\\ 2 J2 *?</(<*) (W) 
i=i 

It is easy to see that the QR method delivers the eigenvalues and first elements of 
the eigenvectors with 0(n 2 ) complexity. 

A similar formula (14) is suggested by [Bai et al, 1996]) based on quadrature 
rules and Lanczos polynomials. The Algorithm 2 is thus another way to compute 
the bilinear forms of the type (1). 

The Lanczos algorithm alone has an 0(nN) complexity, whereas Algorithm 2 
has a greater complexity: 0{nN) + 0(n 2 ). For typical applications in lattice QCD 
the 0(n/N) additional relative overhead is small and therefore Algorithm 2 is the 
recommended algorithm to compute the bilinear form (1). 

We stop the iteration when the underlying liner system is solved to the desired 
accuracy. However, this may be too demanding since the prime interest here is the 
computation of the bilinear form (1). Therefore, a better stopping criterion is to 
monitor the convergence of the bilinear form as proposed in [Bai et al, 1996]. 

To illustrate this situation we give an example from a 12 3 x 24 lattice with 
H = 0.2, bare quark mass m q = —0.869 and a SU(3) gauge field background at bare 
gauge coupling (5 — 5.9. We compute the bilinear form (1) for: 

f(s) = logtanh^, s £ R+ (15) 

and A = Hw(m q ,U) 2 , b € IR^ and b— elements are chosen randomly from the set 
{+!,-!}■ 
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Algorithm 2 The Lanczos algorithm for computing (1). 

Set fa = 0, pi = I/H&H2, go = o, qi = pib 
for i = 1, ... do 

V — Aqi 
cti = q\v 

v := v - qnxi - qi-i/3i-i 

Pi = \\vh 
qi+i = v/fa 

Pi+l = —{PiCti + pi-lfti-l)/ fli 

if l/|pi+i| < e then 

n = i 

stop 
end if 
end for 

Set (r„)i,j = Qi, (r„) i+ i,i = (Tn)i,i+i = ft, otherwise (T n )jj = 
Compute LUi and 21, by the QL method 
Evaluate (1) using (14) 



n Lanczos Algorithm 
10 ( . . . 




Number of Iterations 

Fig. 1. Normalized recursive residual (solid line) and relative differences of (14) 
(dotted line) produced by Algorithm 2. 
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In Fig. 1 are shown the normalized recursive residuals ||6 — Ar;|| 2 / ||6|| 2 , i = 
1, . . . ,n and relative differences of (14) between two successive Lanczos steps. The 
figure illustrates clearly the different regimes of convergence for the linear system 
and the bilinear form. The relative differences of the bilinear form converge faster 
than the computed recursive residual. This example indicates that a stopping crite- 
rion based on the solution of the linear system may indeed be strong and demanding. 
Therefore, the recommended stopping criteria would be to monitor the relative dif- 
ferences of the bilinear form but less frequently than proposed by [Bai et al, 1996] . 
More investigations are needed to settle this issue. Note also the roundoff effects 
(see Fig. 1) in the convergence of the bilinear form which are a manifestation of the 
finite precision of the machine arithmetic. 



5 The link between overlap and domain wall fermions 

Wilson regularization of quarks violates chiral symmetry even for massless quarks. 
This is a serious problem if we would like to compute mass spectrum with light 
sea quarks. The improvement of the discretization helps to reduce chiral symmetry 
breaking terms for a Wilson fermion. However, one must go close to the continuum 
limit in order to benefit from the improvement programme. This is not affordable 
with the present computing power. 

The idea of [Kaplan 1992] opened the door for chiral symmetry realization at 
finite lattice spacing. [Narayanan and Neuberger 1993] proposed overlap fermions 
and [Furman and Shamir 1995] domain wall fermions as a theory of chiral fermions 
on the lattice. The overlap operator is defined by [Neuberger 1998]: 

D(m q , U) = i±^il + l^pL r>aign [H w (M, U)] (1) 

with M G (—2, 0) which is called the domain wall height, which substitutes the orig- 
inal bare quark mass of Wilson fermions. From now on we suppress the dependence 
on M, rriq and U of lattice operators for the ease of notations. 

Domain wall fermions are lattice fermions in 5-dimensional Euclidean space time 
similar to the 4-dimensional formulation of Wilson fermions but with special bound- 
ary conditions along the fifth dimension. The 5-dimensional domain wall operator 
can be given by the L$ x L5 blocked matrix: 

/a 5 D w -11 P+ -m q P- 

M= P- a 5 D w - 1 '-. 

P+ 

\ -m q P+ P- a 5 D w - tJ 

where the blocks are matrices defined on the 4-dimensional lattices and P± are 4x4 
chiral projection operators. Their presence in the blocks with dimensions N x TV 
should be understood as the direct product with the colour and lattice coordinate 
spaces. a 5 is the lattice spacing along the fifth dimension. 

These two apparently different formulations of chiral fermions are in fact closely 
related to each other [Borigi 1999]. To see this we must calculate for the domain 
wall fermions the low energy effective fermion matrix in four dimensions. This can 



P± = 



14 + 75 
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be done by calculating the transfer matrix T along the fifth dimension. Multiplying 
M from the right by the permutation matrix: 



/P+ P- 
P+ 

\P- 



P. 
PJ 



we obtain: 



75 



f(a B H w P+ - t)(P+ - m q P-) a B H w P- + 1 

a 5 H w P+ - 1 



a 5 H w P- + 1 
a 5 H w P+ - tj 



\{a 5 H w P- + 1)(P- - m q P+) 
Further, multiplying this result from the left by the inverse of the diagonal matrix: 



/a 5 H w P+ - 1 



we get: 



a 5 H w P+ - t 



m q P- 



a 5 H w P+ - 1/ 
\ 



-T 



T(m) := 

\-T(P- - m q P+) 
with the transfer matrix T defined by 

11 



-T 
i ) 



(2) 



T = 



1 - a 5 H w P+ 
By requiring the transfer matrix being in the form 

_ 1 + asHw 

it is easy to see that [Borigi 1999]: 

Tiyv — Hw - 



(l + a 3 HwP-) 



(3) 



Finally to derive the four dimensional Dirac operator one has to compute the deter- 
minant of the effective fermion theory in four dimensions: 



detD 



(l 5 ) = detT(m q ) 
detT(l) 



where the subtraction in the denominator corresponds to a 5-dimensional theory 
with anti-periodic boundary conditions along the fifth dimension. It is easy to show 
that the determinant of the L 5 x L 5 block matrix T(m q ) is given by: 
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det T(m q ) = det[(P+ - m g P_) - T Ls (P_ - m„P + 
det T(m q ) = dct[i±^i 75 (l + T L *) + I— ^1(1 - 1 



which gives: 



{L5)= 1+m. , j^m, 

2 + 2 ^I + T^b W 
In the large L5 limit one gets the Neuberger operator (1) but now the operator 
H w substituted with the operator Hw (3). Taking the continuum limit in the fifth 
dimension one gets Hw — > Hw- This way overlap fermions are a limiting theory of 
the domain wall fermions. To achieve the chiral properties as in the case of overlap 
fermions one must take the large L5 limit in the domain wall formulation. 

One can ask the opposite question: is it possible to formulate the overlap in the 
form of the domain wall fermions? The answer is yes and this is done using truncated 
overlap fermions [Borigi 2000c] . The corresponding domain wall matrix is given by: 

/ asDw-l (a 5 D w + i)P+ -m q (a B D w + l)P-\ 

(a s D w + t)P- flsflw-l 



Mi 



{a 5 D w + l)P + 

\-rn q (a 5 D w + t)P+ {a 5 D w + 1)P- a 5 D w ~ 1 



The transfer matrix of truncated overlap fermions is calculated using the same steps 
as above. One gets: 

11 + a 5 H w 

Itov = n— 

I — a^Hw 

The 4-dimensional Dirac operator has the same form as the corresponding operator 
of the domain wall fermion (4) where T is substituted with Ttov (or Hw with Hw)- 
Therefore the overlap Dirac operator (1) is recovered in the large L5 limit. 



6 A two-level algorithm for overlap inversion 

In this section we review the two-level algorithm of [Borigi 2000d]. The basic stucture 
of the algorithm is that of a preconditioned Jacobi: 

x i+1 =x r + S-\b-Dx l ), i = 0,l,... (1) 

where S n is the preconditioner of the overlap operator D given by: 

_ 1 + m q 1 - rn q ^ a k 

where ak,b k £ R, k = l,...,n are coefficients that can be optimized to give the 
best rational approximation of the sign(Hw) with the least number of terms in the 
right hand side. For example one can use the optimal rational approximation coef- 
ficients of Zolotarev [Zolotarev 1877, Petrushev and Popov 1987]. For the rational 
approximation of sign(Hw) one has: 
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D = lim S n 

n — *oo 



(3) 



In order to compute efficiently the inverse of the preconditioner we go back to 
the 5-dimensional formulation of the overlap operator (see the previous section as 
well) which can be written as a matrix in terms of 4-dimensional block matrices: 



l + m Q 1 — m q 



-1 



75 Hw 
-oil H^ + hl 



'-y 5 H w \ 



\-a n l H^ + b n l/ 

This matrix can be also partitioned in the 2x2 blocked form: 



n = 



Hu H 12 
H21 H22 



with Schur complement: 



S11 = Hu — H12H22 H21 
It is easy to show that the following statements hold: 



(4) 



(5) 
(6) 



Proposition 6.1 i) The preconditioner S n is given by the Schur complement: 

S n = Su (7) 

ii) Let Hx = V with % = (y,\i, ■ ■ ■ , X") T and r\ — (r, o, . . . , o) T . Then y is the 
solution of the linear system S n y = r. 



Using these results and keeping n fixed the algorithm of [Borigi 2000d] (known also 
as the multigrid algorithm) has the form of a two level algorithm: This algorithm 



Algorithm 3 A two-level algorithm for overlap inversion 

Set x 1 6 C^, r 1 = b, tol € R+, toll € R+ 
for i = 1, ... do 

Solve approximately H X l+1 = rf such that ||rf - H X l+1 1 12/I \r l \ [2 < toh 
x t+1 =x' + y t+1 
r r+1 = b- Dx t+1 
Stop if ||r l+1 || 2 /||6|| 2 < tol 
end for 



is in the form of nested iterations. One can see that the outer loop is Jacobi it- 
eration which contains inside two inner iterations: the approximate solution of the 
5-dimensional system and the multiplication with the overlap operator D which 
involves the computation of the sign function. For the 5-dimensional system one 
can use any iterative solver which suites the properties of H. We have used many 
forms for TC ranging from rational approximation to domain wall formulations. For 
the overlap multiplication we have used the algorithm of [Borigi, 2000b]. Our test 
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on a small lattice show that the two level algorithm outperforms with an order of 
magnitude the brute force conjugate gradients nested iterations [Borigi 2000d]. 

Since the inner iteration solves the problem in a 5-dimensional lattice with finite 
Z/5 and the outer iteration solves for the 4-dimensional projected 5-dimensional 
system with L5 —> 00, the algorithm in its nature is a multigrid algorithm along the 
fifth dimension. The fact that the multigrid works here is simply the free propagating 
fermions in this direction. If this direction is gauged, the usual problems of the 
multigrid on a 4-dimensional lattice reappear and the idea does not work. In fact, 
this algorithm with n fixed is a two grid algorithm. However, since it does not involve 
the classical prolongations and contractions it can be better described as a two-level 
algorithm. 1 
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